
ToolExpander: Extending the Frontiers of Tool-Using Reinforcement Learning to Weak LLMs

Chen, Fu, Wang, Peng, Li, Xiyin, Li, Wen, Lei, Shichi, Xiang, Dongdong

arXiv.org Artificial Intelligence

Training Large Language Models (LLMs) with Group Relative Policy Optimization (GRPO) encounters a significant challenge: models often fail to produce accurate responses, particularly in small-scale architectures. This limitation not only diminishes performance improvements and undermines the potential of GRPO but also frequently leads to mid-training collapse, adversely affecting stability and final efficacy. To address these issues, we propose ToolExpander, a novel framework that advances tool-oriented reinforcement learning for resource-constrained LLMs through two key innovations: (1) Dynamic Multi-Round Hard Sampling, which dynamically substitutes challenging samples (those without correct outputs over 10 rollouts) with high-quality few-shot demonstrations during training, coupled with an exponential learning rate decay strategy to mitigate oscillations; (2) Self-Exemplifying Thinking, an enhanced GRPO framework that eliminates KL divergence and incorporates adjusted clipping coefficients, encouraging models to autonomously generate and analyze few-shot examples via a minimal additional reward (0.01). Experimental results demonstrate that ToolExpander significantly enhances tool-using capabilities in LLMs, especially in weaker small-scale models, improving both training stability and overall performance.
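The mechanisms described in the abstract can be sketched in a few lines; the specific clip values, decay rate, and data shapes below are illustrative assumptions, not the paper's reported settings.

```python
import math
import random

def grpo_advantages(rewards):
    """Group-relative advantages: rewards normalized within one rollout group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

def clipped_objective(ratio, advantage, clip_low=0.2, clip_high=0.28):
    """PPO-style clipped surrogate with no KL penalty term; the asymmetric
    clip_low/clip_high values are hypothetical stand-ins for the paper's
    'adjusted clipping coefficients'."""
    clipped = max(min(ratio, 1 + clip_high), 1 - clip_low)
    return min(ratio * advantage, clipped * advantage)

def reward_with_self_example(correct, produced_example, bonus=0.01):
    """Base correctness reward plus the minimal additional reward (0.01)
    for autonomously generating a few-shot example."""
    return (1.0 if correct else 0.0) + (bonus if produced_example else 0.0)

def substitute_hard_samples(batch, rollout_correct_counts, demos):
    """Dynamic Multi-Round Hard Sampling: a prompt with zero correct outputs
    across its rollouts is swapped for a high-quality few-shot demo."""
    return [random.choice(demos) if n == 0 else prompt
            for prompt, n in zip(batch, rollout_correct_counts)]

def decayed_lr(base_lr, step, decay=0.999):
    """Exponential learning-rate decay to damp training oscillations."""
    return base_lr * decay ** step
```

The hard-sample substitution keeps every gradient step informative: a group whose rollouts are all wrong yields zero group-relative advantage, so replacing it with a demonstration avoids wasted (or destabilizing) updates.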


Distilling Calibration via Conformalized Credal Inference

Huang, Jiayi, Park, Sangwoo, Paoletti, Nicola, Simeone, Osvaldo

arXiv.org Artificial Intelligence

Deploying artificial intelligence (AI) models on edge devices involves a delicate balance between meeting stringent complexity constraints, such as limited memory and energy resources, and ensuring reliable performance in sensitive decision-making tasks. One way to enhance reliability is through uncertainty quantification via Bayesian inference. This approach, however, typically necessitates maintaining and running multiple models in an ensemble, which may exceed the computational limits of edge devices. This paper introduces a low-complexity methodology to address this challenge by distilling calibration information from a more complex model. In an offline phase, predictive probabilities generated by a high-complexity cloud-based model are leveraged to determine a threshold based on the typical divergence between the cloud and edge models. At run time, this threshold is used to construct credal sets -- ranges of predictive probabilities that are guaranteed, with a user-selected confidence level, to include the predictions of the cloud model. The credal sets are obtained through thresholding of a divergence measure in the simplex of predictive probabilities. Experiments on visual and language tasks demonstrate that the proposed approach, termed Conformalized Distillation for Credal Inference (CD-CI), significantly improves calibration performance compared to low-complexity Bayesian methods, such as Laplace approximation, making it a practical and efficient solution for edge AI deployments.
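The offline/run-time split described above can be sketched as follows; the use of KL divergence and this particular conformal quantile rule are assumptions standing in for the paper's divergence measure and calibration details.

```python
import math

def kl(p, q, eps=1e-12):
    """KL divergence D(p || q) between two points on the probability simplex."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def conformal_threshold(cal_cloud, cal_edge, alpha=0.1):
    """Offline phase: score each calibration example by the divergence
    between cloud and edge predictive probabilities, then take the
    conformal quantile. With n scores, the ceil((n+1)(1-alpha))-th
    smallest gives >= 1 - alpha coverage for a fresh cloud prediction."""
    scores = sorted(kl(c, e) for c, e in zip(cal_cloud, cal_edge))
    n = len(scores)
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    return scores[k - 1]

def in_credal_set(candidate, edge_probs, tau):
    """Run time: the credal set is every distribution within divergence
    tau of the edge model's prediction."""
    return kl(candidate, edge_probs) <= tau
```

Only the scalar threshold needs to be shipped to the device: at run time the edge model checks candidate distributions against its own prediction, with the conformal guarantee that the cloud model's prediction lies inside the set at the chosen confidence level.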


Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression

Liu, Jiduan, Liu, Jiahao, Wang, Qifan, Wang, Jingang, Cai, Xunliang, Zhao, Dongyan, Wang, Ran Lucien, Yan, Rui

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks. However, the massive size of these models poses huge challenges for their deployment in real-world applications. While numerous model compression techniques have been proposed, most of them are not well-suited for achieving extreme model compression when there is a significant gap in model scale. In this paper, we introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT), which effectively transfers the knowledge of LLMs to extremely small-scale models (e.g., 1%). In particular, our approach extracts knowledge from LLMs to construct a knowledge store, from which the small-scale model can retrieve relevant information and leverage it for effective inference. To improve the quality of the model, soft prompt tuning and Proximal Policy Optimization (PPO) reinforcement learning techniques are employed. Extensive experiments are conducted on low-resource tasks from SuperGLUE and GLUE benchmarks. The results demonstrate that the proposed approach significantly enhances the performance of small-scale models by leveraging the knowledge from LLMs.
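The retrieve-then-infer loop can be sketched as a toy knowledge store; cosine similarity and the hand-set embeddings below are illustrative assumptions, since in practice embeddings would come from an encoder and the store from LLM-generated knowledge.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class KnowledgeStore:
    """Toy store of (embedding, knowledge text) pairs distilled from an LLM;
    the small model retrieves the top-k nearest entries and conditions its
    inference on them."""
    def __init__(self):
        self.entries = []  # list of (embedding, knowledge_text)

    def add(self, embedding, knowledge_text):
        self.entries.append((embedding, knowledge_text))

    def retrieve(self, query_embedding, k=2):
        # Rank all stored entries by similarity to the query, descending.
        ranked = sorted(self.entries,
                        key=lambda e: cosine(query_embedding, e[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]
```

The design choice mirrors the paradigm: the large model's knowledge lives outside the small model's parameters, so the small model only needs the capacity to retrieve and consume it, not to memorize it.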


Artificial intelligence to influence top tech trends in major way in next five years

#artificialintelligence

Artificial intelligence will be the common theme in the top 10 technology trends in the next few years, and these are expected to quicken breakthroughs across key economic sectors and society, the Alibaba Damo Academy says. The global research arm of Chinese technology major Alibaba Group says innovation will be extended from the physical world to a mixed reality, as more innovation finds its way to industrial applications and digital technology drives a green and sustainable future. "Digital technologies are growing faster than ever," Jeff Zhang, president of Alibaba Cloud Intelligence and head of Alibaba Damo, said in a report released on Monday. "The advancements in digitisation, 'internetisation' and intelligence are redefining a digital world that is characterised by the prevalence of mixed reality. Digital technology plays an important role in powering a green and sustainable future, whether it is applied in industries such as green data centres and energy-efficient manufacturing, or in day-to-day activities like paperless office."